EST-PAC HPC – a web portal for high-throughput EST annotation and protein sequence prediction

Authors

  • Adam K.L. Wong
  • Andrzej M. Goscinski
  • Christophe Lefèvre
Abstract

Expressed Sequence Tags (ESTs) are short DNA sequences generated by sequencing the transcribed cDNAs that result from gene expression. They can provide significant functional, structural and evolutionary information and are thus a primary resource for gene discovery. EST annotation refers to the analysis of unknown ESTs, which can be performed by database similarity search for possible identities and by database search for functional prediction of translation products. This kind of annotation typically consists of a series of repetitive tasks that should be automated, customizable and amenable to distributed computing resources. Furthermore, EST data should be processed efficiently on a high performance computing platform. In this paper, we describe an EST annotator, EST-PAC, which has been developed to harness HPC resources, potentially from Grid and Cloud systems, for high-throughput EST annotation. The performance analysis of EST-PAC has shown that it provides a substantial performance gain in EST annotation.
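
The abstract describes EST annotation as a series of repetitive tasks that can be automated and spread over distributed resources. The following is a minimal sketch of that general idea only, not EST-PAC's implementation: it splits FASTA-formatted ESTs into batches and processes the batches concurrently. All function names are hypothetical, and `placeholder_annotate` merely stands in for a real similarity-search or functional-prediction step.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterator, List, Tuple

Record = Tuple[str, str]  # (identifier, nucleotide sequence)


def parse_fasta(text: str) -> Iterator[Record]:
    """Yield (identifier, sequence) pairs from FASTA-formatted text."""
    header = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            parts = line[1:].split()
            # Use the first word of the header line as the identifier.
            header = parts[0] if parts else ""
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def make_batches(records: List[Record], batch_size: int) -> List[List[Record]]:
    """Split the records into consecutive batches of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [records[i:i + batch_size] for i in range(0, len(records), batch_size)]


def placeholder_annotate(batch: List[Record]) -> List[Dict[str, object]]:
    """Hypothetical stand-in for a real annotation step.

    It only reports sequence length and GC fraction; an actual pipeline
    would run a database search here instead.
    """
    rows = []
    for name, seq in batch:
        upper = seq.upper()
        gc = (upper.count("G") + upper.count("C")) / len(upper) if upper else 0.0
        rows.append({"id": name, "length": len(upper), "gc": gc})
    return rows


def annotate_in_parallel(
    records: List[Record],
    annotate: Callable[[List[Record]], List[Dict[str, object]]] = placeholder_annotate,
    batch_size: int = 2,
    workers: int = 4,
) -> List[Dict[str, object]]:
    """Annotate batches concurrently and return rows in input order."""
    batches = make_batches(records, batch_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map returns results in the order the batches were submitted.
        results = list(pool.map(annotate, batches))
    return [row for batch_rows in results for row in batch_rows]
```

A thread pool is used here only to keep the sketch self-contained. On a cluster, Grid or Cloud system, each batch would more plausibly be submitted as a separate job, with the batching and result-merging logic staying the same.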

Similar resources

Leveraging EST Evidence to Automatically Predict Alternatively Spliced Genes, Master's Thesis, December 2006

Current methods for high-throughput automatic annotation of newly sequenced genomes are largely limited to tools which predict only one transcript per gene locus. Evidence suggests that 20-50% of genes in higher eukaryotic organisms are alternatively spliced. This leaves the remainder of the transcripts to be annotated by hand, an expensive, time-consuming process. Genomes are being sequenced at...

ESTAnnotator: a tool for high throughput EST annotation

In high throughput sequence analysis, it is often necessary to combine the results of contemporary bioinformatics tools, because no individual tool alone computes all the requested information. ESTAnnotator is a tool for the high throughput annotation of expressed sequence tags (ESTs) by automatically running a collection of bioinformatics applications. In the first step, a quality check is per...

ESTpass: a web-based server for processing and annotating expressed sequence tag (EST) sequences

We present a web-based server, called ESTpass, for processing and annotating sequence data from expressed sequence tag (EST) projects. ESTpass accepts a FASTA-formatted EST file and its quality file as inputs, and it then executes a back-end EST analysis pipeline consisting of three consecutive steps. The first is cleansing the input EST sequences. The second is clustering and assembling the cl...

Gene structure prediction from consensus spliced alignment of multiple ESTs matching the same genomic locus

MOTIVATION Accurate gene structure annotation is a challenging computational problem in genomics. The best results are achieved with spliced alignment of full-length cDNAs or multiple expressed sequence tags (ESTs) with sufficient overlap to cover the entire gene. For most species, cDNA and EST collections are far from comprehensive. We sought to overcome this bottleneck by exploring the possib...

Portal Design, Synchrotron and HPC Services in e-HTPX – A resource for High Throughput Protein Crystallography

The e-HTPX portal/hub has been designed to provide a single point of access for the coordination and remote execution of protein crystallography experiments. The portal acts as an access gateway and Web Service response hub to key synchrotron laboratory data collection facilities, and Grid accessible HPC services intended to aid the process of determining protein structures. The portal greatly ...

Publication date: 2011